9 research outputs found

    Tweets as data: Demonstration of TweeQL and TwitInfo

    Microblogs such as Twitter are a tremendous repository of user-generated content. Increasingly, we see tweets used as data sources for novel applications such as disaster mapping, brand sentiment analysis, and real-time visualizations. In each scenario, the workflow for processing tweets is ad hoc, and a lot of unnecessary work goes into repeating common data processing patterns. We introduce TweeQL, a stream query processing language that presents a SQL-like query interface for unstructured tweets to generate structured data for downstream applications. We have built several tools on top of TweeQL, most notably TwitInfo, an event timeline generation and exploration interface that summarizes events as they are discussed on Twitter. Our demonstration will allow the audience to interact with both TweeQL and TwitInfo to convey the value of data embedded in tweets.

    Processing and visualizing the data in tweets

    Microblogs such as Twitter provide a valuable stream of diverse user-generated data. While the data extracted from Twitter is generally timely and accurate, the process by which developers extract structured data from the tweet stream is ad hoc and requires reimplementation of common data manipulation primitives. In this paper, we present two systems for querying and extracting structure from Twitter-embedded data. The first, TweeQL, provides a streaming SQL-like interface to the Twitter API, making common tweet processing tasks simpler. The second, TwitInfo, shows how end-users can interact with and understand aggregated data from the tweet stream, in addition to showcasing the power of the TweeQL language. Together these systems show the richness of content that can be extracted from Twitter.

    TwitInfo: Aggregating and Visualizing Microblogs for Event Exploration

    Microblogs are a tremendous repository of user-generated content about world events. However, for people trying to understand events by querying services like Twitter, a chronological log of posts makes it very difficult to get a detailed understanding of an event. In this paper, we present TwitInfo, a system for visualizing and summarizing events on Twitter. TwitInfo allows users to browse a large collection of tweets using a timeline-based display that highlights peaks of high tweet activity. A novel streaming algorithm automatically discovers these peaks and labels them meaningfully using text from the tweets. Users can drill down to subevents, and explore further via geolocation, sentiment, and popular URLs. We contribute a recall-normalized aggregate sentiment visualization to produce more honest sentiment overviews. An evaluation of the system revealed that users were able to reconstruct meaningful summaries of events in a small amount of time. An interview with a Pulitzer Prize-winning journalist suggested that the system would be especially useful for understanding a long-running event and for identifying eyewitnesses. Quantitatively, our system can identify 80-100% of manually labeled peaks, facilitating a relatively complete view of each event studied.
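
The abstract above does not spell out the streaming peak-detection algorithm, so the following is only a minimal sketch of one plausible approach, not TwitInfo's actual method. It assumes tweets have already been binned into per-interval counts and flags a bin as part of a peak when it exceeds an exponentially weighted running mean by a multiple of the running mean deviation. The function name `find_peaks` and the parameters `alpha`, `tau`, and `warmup` are hypothetical.

```python
def find_peaks(counts, alpha=0.125, tau=2.0, warmup=5):
    """Return (start, end) index windows where counts spike above a running baseline.

    Assumptions (not from the source): counts are tweets per time bin, the
    baseline is an exponentially weighted mean and mean deviation, and the
    baseline is frozen while a peak is in progress.
    """
    if len(counts) <= warmup:
        return []
    # Initialise the baseline from the first `warmup` bins.
    mean = sum(counts[:warmup]) / warmup
    dev = sum(abs(c - mean) for c in counts[:warmup]) / warmup
    peaks, start = [], None
    for i in range(warmup, len(counts)):
        c = counts[i]
        # Floor the deviation at 1.0 so a perfectly flat history does not
        # turn every tiny fluctuation into a peak.
        hot = c > mean + tau * max(dev, 1.0)
        if hot and start is None:
            start = i                      # a peak window opens
        elif not hot and start is not None:
            peaks.append((start, i - 1))   # the peak window closes
            start = None
        if not hot:
            # Update the baseline only on quiet bins.
            dev = (1 - alpha) * dev + alpha * abs(c - mean)
            mean = (1 - alpha) * mean + alpha * c
    if start is not None:
        peaks.append((start, len(counts) - 1))
    return peaks


print(find_peaks([10, 11, 9, 10, 10, 10, 50, 60, 12, 10]))
```

Freezing the baseline during a peak is a design choice of this sketch: it keeps a long burst from raising the threshold and cutting the detected window short.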

    Modeling tax evasion with genetic algorithms

    The U.S. tax gap is estimated to exceed $450 billion, most of which arises from non-compliance on the part of individual taxpayers (GAO 2012; IRS 2006). Much is hidden in innovative tax shelters combining multiple business structures such as partnerships, trusts, and S-corporations into complex transaction networks designed to reduce and obscure the true tax liabilities of their individual shareholders. One known gambit employed by these shelters is to offset real gains in one part of a portfolio by creating artificial capital losses elsewhere through the mechanism of “inflated basis” (TaxAnalysts 2005), a process made easier by the relatively flexible set of rules surrounding “pass-through” entities such as partnerships (IRS 2009). The ability to anticipate the likely forms of emerging evasion schemes would help auditors develop more efficient methods of reducing the tax gap. To this end, we have developed a prototype evolutionary algorithm designed to generate potential schemes of the inflated basis type described above. The algorithm takes as inputs a collection of asset types and tax entities, together with a rule-set governing asset exchanges between these entities. The schemes produced by the algorithm consist of sequences of transactions within an ownership network of tax entities. Schemes are ranked according to a “fitness function” (Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Boston, 1989); the very best schemes are those that afford the highest reduction in tax liability while incurring the lowest expected penalty. Mitre Corporation (Innovation Program

    Simulating tax evasion using agent based modelling and evolutionary search

    Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. Cataloged from PDF version of thesis. Includes bibliographical references (page 61). We present a design and model for Simulating Co-Evolution of Tax and Evasion (SCOTE). The system performs agent-based modeling of the tax ecosystem and searches for tax evasion strategies using a variant of a genetic algorithm with a grammar. Current methodologies and tools to detect, discover, or recognize tax evasion are not sufficient. In recent years the tax gap, the aggregate difference between the tax owed in principle and the tax paid in practice, was calculated to exceed 450 billion dollars. Numerous tax evasion schemes have surfaced that perform seemingly legal transactions but whose sole purpose, on close inspection, is to reduce tax liability. Moreover, these schemes evolve over time. Whenever a scheme is detected and eliminated by fixing a loophole in the tax code, others emerge to replace it, and currently there is no systematic way to predict the emergence of these schemes. SCOTE allows us to encode tax evasion strategies into a searchable representation. SCOTE has three major components: the Genetic Algorithm (GA) library, the Parser, and the Interpreter. The GA encodes transaction plans into an integer representation and searches over the transaction plans to find a scheme that produces the maximum tax gap. The Parser performs a grammatical mapping from a list of integers to a transaction plan. The Interpreter models the tax ecosystem as a graph in which entities such as taxpayers and partnerships are nodes and the transactions between entities are edges. Each entity has a portfolio of assets, and the values of the assets are updated after a transaction. The Interpreter runs a transaction plan generated by the GA on the graph to produce the tax gap. We ran two experiments using two known tax evasion schemes, "Son of Boss" and "iBOB", and we were able to detect both schemes using SCOTE. By Osama Badar. M. Eng.
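
The Parser step, mapping a list of integers to a transaction plan through a grammar, can be illustrated with a grammatical-evolution-style sketch. The thesis's actual grammar and mapping rules are not given in this abstract, so the grammar below, the entity and asset names, and the function `map_genome` are all hypothetical. The sketch assumes each integer selects a production by modulo, and that the genome wraps around when exhausted.

```python
# Hypothetical toy grammar; the real SCOTE grammar is not given in the abstract.
GRAMMAR = {
    "<plan>": [["<txn>"], ["<txn>", "; ", "<plan>"]],
    "<txn>": [["transfer(", "<entity>", ",", "<entity>", ",", "<asset>", ")"]],
    "<entity>": [["Taxpayer"], ["PartnershipA"], ["PartnershipB"]],
    "<asset>": [["Cash"], ["Stock"], ["Note"]],
}


def map_genome(genome, start="<plan>", max_steps=100):
    """Expand `start` into a transaction-plan string, guided by integer codons.

    Assumptions (not from the source): leftmost expansion, one codon consumed
    per non-terminal that has more than one production, and wrap-around when
    the genome runs out.
    """
    if not genome:
        raise ValueError("genome must contain at least one integer")
    symbols, out = [start], []
    idx = steps = 0
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:
            out.append(sym)                # terminal: emit as-is
            continue
        if steps >= max_steps:
            raise ValueError("expansion did not terminate within max_steps")
        steps += 1
        options = GRAMMAR[sym]
        if len(options) == 1:
            choice = options[0]            # no decision, so no codon is used
        else:
            choice = options[genome[idx % len(genome)] % len(options)]
            idx += 1
        symbols = list(choice) + symbols   # expand the leftmost non-terminal
    return "".join(out)


print(map_genome([1, 0, 1, 0, 0, 2, 0, 2]))
```

The `max_steps` guard matters because a genome that always picks the recursive `<plan>` production would otherwise expand forever; a GA built on this sketch could treat such genomes as invalid individuals.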

    A study on the effect of bioactive glass and hydroxyapatite-loaded Xanthan dialdehyde-based composite coatings for potential orthopedic applications

    The most important challenge in designing orthopedic devices is to control the leaching of ions from the substrate material and to prevent biofilm formation. Accordingly, surgical-grade stainless steel (316L SS) was electrophoretically coated with a functional composition of biopolymers and bioceramics. The composite coating consisted of bioglass (BG), hydroxyapatite (HA), and lawsone, loaded into a polymeric matrix of Xanthan Dialdehyde/Chondroitin Sulfate (XDA/CS). The parameters and final composition for electrophoretic deposition were optimized through a trial-and-error approach. The composite coating exhibited significant adhesion strength of “4B” (ASTM D3359) with the substrate, suitable wettability (contact angle 48°), and an optimum average surface roughness of 0.32 µm, thus promoting proliferation and attachment of bone-forming cells, transcription factors, and proteins. Fourier transform infrared spectroscopic analysis revealed a strong polymeric network formation between XDA and CS. Scanning electron microscopy and energy dispersive X-ray spectroscopy analysis displayed a homogeneous surface with invariable dispersion of HA and BG particles. The adhesion, hydrant behavior, and topography of the coatings were optimal for designing orthopedic implant devices. The coatings exhibited clear inhibition zones of 21.65 mm and 21.04 mm, with no bacterial growth, against Staphylococcus aureus (S. aureus) and Escherichia coli (E. coli), respectively, confirming their antibacterial potential. Furthermore, crystals related to calcium (Ca) and HA were seen after 28 days of submersion in simulated body fluid. The corrosion current density of the coating was minimal compared to the bare 316L SS substrate. The results suggest that the XDA/CS/BG/HA/lawsone-based composite coating can be a candidate for coating orthopedic implant devices.

    Effect of Wet Aging on Color Stability, Tenderness, and Sensory Attributes of Longissimus lumborum and Gluteus medius Muscles from Water Buffalo Bulls

    The present study aimed to investigate the effect of wet aging on meat quality characteristics of Longissimus lumborum (LL) and Gluteus medius (GM) muscles of buffalo bulls. Meat samples from six aging periods, i.e., 0 days (d) = control, 7 d, 14 d, 21 d, 28 d, and 35 d, were evaluated for pH, color, metmyoglobin content (MetMb%), cooking loss, water holding capacity (WHC), myofibrillar fragmentation index (MFI), Warner–Bratzler shear force (WBSF), and sensory attributes. The pH, instrumental color redness (a*), yellowness (b*), chroma (C*), and MetMb% values increased, while the lightness (L*) and hue angle (h*) values showed non-significant (p > 0.05) differences in both LL and GM muscles across all aging periods. The cooking loss increased while WHC decreased up to 35 days of aging. MFI values significantly (p < 0.05) increased, while WBSF values decreased; in addition, sensory characteristics improved as the aging period increased. Overall, the color, tenderness, and sensory characteristics improved in LL and GM muscles until 28 and 21 days of aging, respectively. Based on the evaluated meat characteristics, 28 days of aging is required to improve the meat quality characteristics of LL, whereas 21 days of aging is suitable for GM muscle.